Abstract

The Asymmetric Traveling Salesperson Path (ATSPP) problem is one where, given an asymmetric
metric space $(V, d)$ with specified vertices $s$ and $t$, the goal is to find an $s$-$t$ path of minimum length that
passes through all the vertices in $V$.

This problem is closely related to the Asymmetric TSP (ATSP) problem, which seeks to find a tour
(instead of an $s$-$t$ path) visiting all the nodes: for ATSP, a $\rho$-approximation guarantee implies an
$O(\rho)$-approximation for ATSPP. However, no such connection is known for the integrality gaps of the linear
programming relaxations for these problems: the current-best approximation algorithm for ATSPP is
$O(\log n/\log\log n)$, whereas the best bound on the integrality gap of the natural LP relaxation (the subtour
elimination LP) for ATSPP is $O(\log n)$.

In this paper, we close this gap, and improve the current best bound on the integrality gap from
$O(\log n)$ to $O(\log n/\log\log n)$. The resulting algorithm uses the structure of narrow $s$-$t$ cuts in the LP
solution to construct a (random) tree witnessing this integrality gap. We also give a simpler family of
instances showing the integrality gap of this LP is at least 2.

1 Introduction 

In the Asymmetric Traveling Salesperson Path (ATSPP) problem, we are given an asymmetric metric space
$(V, d)$ (i.e., one where the distances satisfy the triangle inequality, but potentially not the symmetry
condition), and also specified source and sink vertices $s$ and $t$, and the goal is to find an $s$-$t$ Hamilton path of
minimum length.

This ATSPP problem is a close relative of the Asymmetric TSP problem (ATSP), where the goal is to find a
Hamilton tour instead of an $s$-$t$ path. For this ATSP problem, the $\log_2 n$-approximation of Frieze, Galbiati,
and Maffioli [9] from 1982 was the best result known for more than two decades, until it was finally improved
by constant factors in [4, 11, 8]. A breakthrough on this problem was an $O(\log n/\log\log n)$-approximation result
due to Asadpour, Goemans, Madry, Oveis Gharan, and Saberi [2]; they also bounded the integrality gap of
the subtour elimination linear programming relaxation for ATSP by the same factor.

torial Optimization, 2013.

Somewhat surprisingly, the study of ATSPP has been of a more recent vintage: the first approximation
algorithms appeared only around 2005 [13, 6, 8]. It is easily seen that the ATSP reduces to ATSPP in
an approximation preserving fashion (by guessing two consecutive nodes on the tour). In the other
direction, [8] showed that a $\rho$-approximation to the ATSP problem implies an $O(\rho)$-approximation to the ATSPP
problem. Using the above-mentioned $O(\log n/\log\log n)$-approximation for ATSP [2], this implies an
$O(\log n/\log\log n)$-approximation for ATSPP as well.

The subtour elimination linear program generalizes simply to the ATSPP problem and is given in Section 2.
However, the best previous integrality gap for this LP for ATSPP was $O(\log n)$ [10]. In this paper we show
the following result.

Theorem 1.1. The integrality gap of the subtour elimination linear program for the ATSPP problem is at
most $O(\log n/\log\log n)$.

We also give a simple construction showing that the integrality gap of this LP is at least 2; this example
is simpler than the previously known integrality gap instance showing the same lower bound, due to Charikar,
Goemans, and Karloff [5].

Given the central nature of linear programs in approximation algorithms, it is useful to understand the
integrality gaps for linear programming relaxations of optimization problems. Not only does this study give
us a deeper understanding of the underlying problems, but also upper bounds on the integrality gap of
LPs are often required for some reductions to go through. For example, the polylogarithmic approximation
guarantees in the work of Nagarajan and Ravi [14] for Directed Orienteering and Minimum Ratio Rooted
Cycle, and those in the work of Bateni and Chuzhoy [3] for Directed $k$-Stroll and Directed $k$-Tour were
all improved by a factor of $\log\log n$ following the improved bound of $O(\log n/\log\log n)$ on the integrality gap of
the subtour LP relaxation for ATSP. Note that these improvements do not follow merely from improved
approximation guarantees.

1.1 Our Approach 

Our approach to bound the integrality gap for ATSPP is similar to that for ATSP [2], but with some crucial 
differences. We sample a random spanning tree and then augment the directed version of this tree to an 
integral circulation using Hoffman's circulation theorem while ensuring the t-s edge is only used once. 
Following the corresponding Eulerian circuit and deleting the t-s edge results in a spanning s-t walk. 

However, the non-Eulerian nature of the ATSPP problem makes it difficult to satisfy the cut requirements 
in Hoffman's circulation theorem if we sample the spanning tree directly from the distribution given by the 
LP solution. It turns out that the problems come from the $s$-$t$ cuts $U$ that are nearly-tight: i.e., which satisfy
$1 \le x^*(\partial^+(U)) < 1 + \tau$ for some small constant $\tau$; these give rise to problems when the sampled spanning
tree includes more than one edge across this cut. Such problems also arise in the symmetric TSP paths case
(studied in a recent paper of An, Kleinberg, and Shmoys [1]): their approach is again to take a random tree 
directly from the distribution given by the optimal LP solution, but in some cases they need to boost the 
narrow cuts, and they show that the loss due to this boosting is small. 

In our case, the asymmetry in the problem means that boosting the narrow cuts might be prohibitively
expensive. Hence, our idea is to preprocess the distribution given by the LP solution to tighten the narrow
cuts, so that we never pick two edges from a narrow cut. Since the original LP solution lies in the spanning
tree polytope, lowering the solution on some edges means we need to raise other edges, which causes the
costs to increase, and the technical heart of the paper is to ensure this can be done with little extra loss.

1.2 Other Related Work



The first non-trivial approximation for ATSPP was an $O(\sqrt{n})$-approximation by Lam and Newman [13].
This was improved to $O(\log n)$ by Chekuri and Pal [6], and the constant was further improved in [8]. The
paper [8] also showed that ATSP and ATSPP had approximability within a constant factor of each other.
All these results are combinatorial and do not bound the integrality gap of ATSPP. A bound of $O(\sqrt{n})$ on the
integrality gap of ATSPP was given by Nagarajan and Ravi [15], and was improved to $O(\log n)$ by Friggstad,
Salavatipour and Svitkina [10]. Note that there is still no result known that relates the integrality gaps of the
ATSP and ATSPP problems in a black-box fashion.

In the symmetric case (where the problems become TSPP and TSP respectively), constant factor
approximations and integrality gaps have long been known. We do not survey the rich body of literature on TSP
here, instead pointing the reader to, e.g., the recent paper on graphical TSP by Sebo and Vygen [17]. It is,
however, important to mention the recent 1.618-approximation for TSPP in a beautiful new result by
An, Kleinberg, and Shmoys [1]. They proceed via bounding the integrality gap of the LP relaxation, and
their algorithm also proceeds via studying the narrow $s$-$t$ cuts; the connections to their work are discussed
in Section 1.1.

1.3 Notation and Preliminaries 

Given a directed graph $G = (V, A)$, and two disjoint sets $U, U' \subseteq V$, let $\partial(U; U') = A \cap (U \times U')$. We use the
standard shorthand that $\partial^+(U) := \partial(U; V \setminus U)$, and $\partial^-(U) := \partial(V \setminus U; U)$. When the set $U$ is a singleton
(say $U = \{u\}$), we use $\partial^+(u)$ or $\partial^-(u)$ instead of $\partial^+(\{u\})$ or $\partial^-(\{u\})$. For an undirected graph $H = (V, E)$,
we use $\partial(U; U')$ to denote edges crossing between $U$ and $U'$, and $\partial(U)$ to denote the edges with exactly one
endpoint in $U$ (which is the same as $\partial(V \setminus U)$).

For a digraph $G = (V, A)$, a set of arcs $B \subseteq A$ is weakly connected if the undirected version of $B$ forms a
connected graph that spans all vertices in $V$.

For values $x_a \in \mathbb{R}$ for all $a \in A$, and a set of arcs $B \subseteq A$, we let $x(B)$ denote the sum $\sum_{a \in B} x_a$.

Given an undirected graph $H = (V, E)$, we let $\chi_T \in \{0, 1\}^{|E|}$ denote the characteristic vector of a spanning
tree $T$; then the spanning tree polytope is the convex hull of $\{\chi_T \mid T \text{ spanning tree of } H\}$. See, e.g., [16,
Chapter 50] for several equivalent linear programming formulations of this polytope. We sometimes abuse
notation and call a set of directed arcs $T$ a tree if the undirected version of $T$ is a tree in the usual sense.

A directed metric graph on vertices $V$ has an arc $uv \in A$ for every ordered pair of distinct vertices $u, v \in V$, where the
non-negative arc costs satisfy the triangle inequality $c_{uv} + c_{vw} \ge c_{uw}$ for all $u, v, w \in V$. However, arcs $uv$ and
$vu$ need not have the same cost. An instance of the ATSPP problem is a directed metric graph along with distinguished vertices $s \ne t$.
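To make the problem definition concrete, the following minimal sketch checks the directed triangle inequality and solves a tiny ATSPP instance by exhaustive search. The cost matrix and the function names are hypothetical illustrations, not part of the paper, and the search is exponential, so it is only meant for toy inputs.

```python
from itertools import permutations

def is_directed_metric(cost):
    """Check c[u][v] + c[v][w] >= c[u][w] for all triples (symmetry is NOT required)."""
    n = len(cost)
    return all(
        cost[u][v] + cost[v][w] >= cost[u][w]
        for u in range(n) for v in range(n) for w in range(n)
    )

def brute_force_atspp(cost, s, t):
    """Return (length, path) of a cheapest Hamiltonian s-t path by trying every order."""
    inner = [v for v in range(len(cost)) if v not in (s, t)]
    best = None
    for order in permutations(inner):
        path = (s, *order, t)
        length = sum(cost[a][b] for a, b in zip(path, path[1:]))
        if best is None or length < best[0]:
            best = (length, path)
    return best

# Hypothetical 4-vertex instance: cost[u][v] is the cost of arc uv,
# and cost[u][v] differs from cost[v][u] for several pairs.
cost = [
    [0, 1, 2, 2],
    [2, 0, 1, 2],
    [2, 2, 0, 1],
    [1, 2, 2, 0],
]
forward = brute_force_atspp(cost, 0, 3)   # best path from vertex 0 to vertex 3
backward = brute_force_atspp(cost, 3, 0)  # best path from vertex 3 to vertex 0
```

On this instance the two directions have different optimal lengths, which is exactly the asymmetry the definition allows.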

2 The Rounding Algorithm 

In this section, we give the linear programming relaxation for the Asymmetric TSP Path problem, and show
how to round it to get a path of cost at most $O(\log n/\log\log n)$ times the cost of the optimal LP solution. We then
give the proof, with some of the details being deferred to the following sections.

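Given a directed metric graph $G = (V, A)$ with arc costs $\{c_a\}_{a \in A}$, we use the following standard linear
programming relaxation for ATSPP which is also known as the subtour elimination linear program.

\[
\begin{array}{llll}
\text{minimize:} & \sum_{a \in A} c_a x_a & & \text{(ATSPP)} \\
\text{s.t.:} & x(\partial^+(s)) = x(\partial^-(t)) = 1 & & (1) \\
 & x(\partial^-(s)) = x(\partial^+(t)) = 0 & & (2) \\
 & x(\partial^+(v)) = x(\partial^-(v)) = 1 & \forall v \in V \setminus \{s, t\} & (3) \\
 & x(\partial^+(U)) \ge 1 & \forall \{s\} \subseteq U \subsetneq V & (4) \\
 & x_a \ge 0 & \forall a \in A &
\end{array}
\]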



We begin by solving the above LP to obtain an optimal solution $x^*$. Consider the undirected (multi)graph
$H = (V, E)$ obtained by removing the orientation of the arcs of $G$. That is, create precisely two edges between
every two nodes $u, v \in V$ in $H$, one having cost $c_{uv}$ and the other having cost $c_{vu}$. (Hence, $|E| = |A|$.) For a
point $w \in \mathbb{R}^A$, let $\kappa(w)$ denote the corresponding point in $\mathbb{R}^E$, and view $\kappa(w)$ as the "undirected" version of
$w$.

We will use the following definition: An $s$-$t$ cut is a subset $U \subseteq V$ such that $\{s\} \subseteq U \subseteq V \setminus \{t\}$. The LP
constraints imply that $x^*(\partial^+(U)) - x^*(\partial^-(U)) = 1$ for every $s$-$t$ cut $U$. Also, $x^*(\partial^+(U)) = x^*(\partial^-(U)) \ge 1$
for every nonempty $U \subseteq V \setminus \{s, t\}$.
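As an illustration of constraints (1)-(4) and of the identity that every s-t cut carries exactly one more unit outward than inward, here is a small brute-force checker. It is a sketch under stated assumptions: the dictionary representation of x, the helper names, and both example vectors are hypothetical, and the cut enumeration is exponential in the number of vertices.

```python
from itertools import combinations

def cut_value(x, U, outgoing=True):
    """Total x-value on arcs leaving U (outgoing=True) or entering U (outgoing=False)."""
    U = set(U)
    return sum(
        val for (a, b), val in x.items()
        if ((a in U and b not in U) if outgoing else (a not in U and b in U))
    )

def is_feasible(x, V, s, t, eps=1e-9):
    """Check nonnegativity and constraints (1)-(4) by enumerating every cut containing s."""
    if any(val < -eps for val in x.values()):
        return False
    out = lambda U: cut_value(x, U, True)
    inn = lambda U: cut_value(x, U, False)
    if abs(out({s}) - 1) > eps or abs(inn({t}) - 1) > eps:   # constraint (1)
        return False
    if abs(inn({s})) > eps or abs(out({t})) > eps:           # constraint (2)
        return False
    for v in V:                                              # constraint (3)
        if v not in (s, t) and (abs(out({v}) - 1) > eps or abs(inn({v}) - 1) > eps):
            return False
    others = [v for v in V if v != s]
    for k in range(len(others)):          # U = {s} plus k other vertices, so U != V
        for extra in combinations(others, k):
            if out({s, *extra}) < 1 - eps:                   # constraint (4)
                return False
    return True

V = ["s", "a", "b", "t"]
# Hypothetical half-integral solution.
x_frac = {("s", "a"): 0.5, ("s", "b"): 0.5, ("a", "b"): 0.5,
          ("b", "a"): 0.5, ("a", "t"): 0.5, ("b", "t"): 0.5}
# Hypothetical integral vector with a subtour on {a, b}; it violates (4) for U = {s, t}.
x_subtour = {("s", "t"): 1.0, ("a", "b"): 1.0, ("b", "a"): 1.0}
```

The second vector satisfies all degree constraints, which shows why the cut constraints (4) are needed in addition to (1)-(3).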

Definition 2.1 (Narrow cuts). Let $\tau \ge 0$. An $s$-$t$ cut $U$ is $\tau$-narrow if $x^*(\partial^+(U)) < 1 + \tau$ (or equivalently,
$x^*(\partial^-(U)) < \tau$).

The main technical lemma is the following: 

Lemma 2.2. For any $\tau \in [0, 1/4]$, one can find, in polynomial-time, a vector $z \in [0, 1]^A$ (over the directed
arcs) such that:

(a) its undirected version $\kappa(z)$ lies in the spanning tree polytope for $H$,

(b) $z \le \frac{1}{1 - 3\tau} x^*$ (where the inequality denotes component-wise dominance), and

(c) $z(\partial^+(U)) = 1$ and $z(\partial^-(U)) = 0$ for every $\tau$-narrow $s$-$t$ cut $U$.

Before we prove the lemma (in Section 2.1), let us sketch how it is used to obtain the ATSPP path. Since
$z$ (or more correctly, its undirected version $\kappa(z)$) lies in the spanning tree polytope, it can be represented
as a convex combination of spanning trees. Using some recently-developed algorithms (e.g., those due
to [2, 7]) one can choose a spanning tree that crosses each cut only $O(\log n/\log\log n)$ times more than the LP
solution. Finally, we can use $O(\log n/\log\log n)$ times the LP solution to patch this tree to get an $s$-$t$ path. Since the
LP solution is "weak" on the narrow cuts and may contribute very little to this patching (at most $\tau$), it is
crucial that by property (c) above, this tree will cross the narrow cuts only once, and that too, it crosses in the
"right" direction, so we never need to use the LP when verifying the cut conditions of Hoffman's circulation
theorem on narrow cuts. The details of these operations appear in Section 3.

2.1 The Structure of Narrow Cuts 

We now prove Lemma 2.2: it says that we can take the LP solution $x^*$ and find another vector $z$ such that if
an $s$-$t$ cut is narrow in $x^*$ (i.e., the total $x^*$ value crossing the cut lies in $[1, 1 + \tau)$), then $z$ crosses it to an extent
precisely 1. Moreover, the undirected version of $z$ can be written as a convex combination of spanning trees,
and $z_a$ is not much larger than $x^*_a$ for any arc $a$.

Note that the undirected version of x* itself can be written as a convex combination of spanning trees, so if 
we force z to cross the narrow cuts to an extent less than x* (loosely, this reduces the connectivity), we'd 
better increase the value on other arcs. To show we can perform this operation without changing any of the 
coordinates by very much, we need to study the structure of narrow cuts more closely. (Such a study is done 
in the symmetric TSP path paper of An et al. [1], but our goals and theorems are somewhat different.) 

First, say two $s$-$t$ cuts $U$ and $W$ cross if $U \setminus W$ and $W \setminus U$ are non-empty.

Lemma 2.3. For $\tau \le 1/4$, no two $\tau$-narrow $s$-$t$ cuts cross.

Proof. Suppose $U$ and $W$ are crossing $\tau$-narrow $s$-$t$ cuts. Then

\[
\begin{aligned}
2 + 2\tau &> x^*(\partial^+(U)) + x^*(\partial^+(W)) \\
&= x^*(\partial^+(U \setminus W)) + x^*(\partial^+(W \setminus U)) + x^*(\partial^+(U \cap W)) \\
&\quad + x^*(\partial(U \cap W; V \setminus (U \cup W))) - x^*(\partial((U \cup W) \setminus (U \cap W); U \cap W)) \\
&\ge 1 + 1 + 1 + 0 - 2\tau \\
&= 3 - 2\tau
\end{aligned}
\]

where the last inequality follows from the first three terms being cuts excluding $t$ and hence having at least
unit $x^*$-value crossing them (by the LP constraints), the fourth term being non-negative, and the last term
being the $x^*$-value of a subset of the arcs in $\partial^-(U) \cup \partial^-(W)$ and remembering that $U$ and $W$ are $\tau$-narrow.
However, this contradicts $\tau \le 1/4$. □

Lemma 2.3 says that the $\tau$-narrow cuts form a chain $\{s\} = U_1 \subset U_2 \subset \ldots \subset U_k = V \setminus \{t\}$ with $k \ge 2$. For
$1 < i \le k$, let $L_i := U_i \setminus U_{i-1}$. We also define $L_1 = \{s\}$ and $L_{k+1} = \{t\}$. Let $L_{\le i} := \bigcup_{j=1}^{i} L_j$ and $L_{\ge i} := \bigcup_{j=i}^{k+1} L_j$.
For the rest of this paper, we will use $\tau$ to denote a value in the range $[0, 1/4]$. Ultimately, we will set $\tau := 1/4$
for the final bound but we state the lemmas in their full generality for $\tau \le 1/4$.
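The chain structure can be explored on small examples. The sketch below enumerates all narrow cuts by brute force, checks that they are nested, and returns the layers between consecutive cuts together with the final layer containing only t. The function names and the example vector are hypothetical; the enumeration is exponential and stands in for the polynomial-time procedure of Lemma 2.7 only for illustration.

```python
from itertools import combinations

def narrow_cuts(x, V, s, t, tau):
    """All cuts U with s in U, t not in U, and x-value leaving U below 1 + tau."""
    inner = [v for v in V if v not in (s, t)]
    cuts = []
    for k in range(len(inner) + 1):
        for extra in combinations(inner, k):
            U = frozenset((s, *extra))
            out = sum(val for (a, b), val in x.items() if a in U and b not in U)
            if out < 1 + tau:
                cuts.append(U)
    return sorted(cuts, key=len)

def layers(cuts, t):
    """Given nested cuts U_1, ..., U_k (sorted by size), return the layers L_1, ..., L_{k+1}."""
    assert all(A < B for A, B in zip(cuts, cuts[1:])), "cuts are not nested"
    result = [set(cuts[0])]                                  # L_1 = U_1
    result += [set(B - A) for A, B in zip(cuts, cuts[1:])]   # L_i = U_i \ U_{i-1}
    result.append({t})                                       # L_{k+1} = {t}
    return result

V = ["s", "a", "b", "t"]
# Hypothetical feasible LP solution with value 1/2 on every arc listed.
x = {("s", "a"): 0.5, ("s", "b"): 0.5, ("a", "b"): 0.5,
     ("b", "a"): 0.5, ("a", "t"): 0.5, ("b", "t"): 0.5}
cuts = narrow_cuts(x, V, "s", "t", tau=0.25)
chain = layers(cuts, "t")
```

Here only the two trivial cuts are narrow, so the chain has three layers; the cuts {s, a} and {s, b} carry 1.5 units and are not narrow for tau = 1/4.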

Next, we show that out of the (at most) $1 + \tau$ mass of $x^*$ across each $\tau$-narrow cut $U_i$, most of it comes from
the "local" arcs in $\partial(L_i; L_{i+1})$.

Lemma 2.4. For each $1 \le i \le k$, $x^*(\partial(L_i; L_{i+1})) \ge 1 - 3\tau$.

Proof. For $i = 1$, since $s, t \notin L_2$ we have $x^*(\partial^-(L_2)) \ge 1$ from the LP constraints. We also have
$x^*(\partial^-(U_2)) < \tau$ since $U_2$ is $\tau$-narrow, so at least $1 - \tau$ of $x^*(\partial^-(L_2))$ comes from $x^*(\partial(L_1; L_2))$. A similar
argument for $i = k$ shows $x^*(\partial(L_k; L_{k+1})) \ge 1 - \tau$. So it remains to consider $1 < i < k$. Define the following
quantities, some of which can be zero.

• $A = x^*(\partial(L_i; L_{i+1}))$

• $B = x^*(\partial(L_i; L_{\ge i+2}))$

• $C = x^*(\partial(L_{\le i-1}; L_{i+1}))$

We have

\[ 1 \le x^*(\partial^+(L_i)) = A + B + x^*(\partial(L_i; L_{\le i-1})) \le A + B + \tau, \]

since $\partial(L_i; L_{\le i-1}) \subseteq \partial^-(U_{i-1})$ and $U_{i-1}$ is $\tau$-narrow. Similarly

\[ 1 \le x^*(\partial^-(L_{i+1})) = A + C + x^*(\partial(L_{\ge i+2}; L_{i+1})) \le A + C + \tau. \]

Summing these two inequalities yields $2 \le A + (A + B + C) + 2\tau \le A + (1 + \tau) + 2\tau$ where we have used
$A + B + C \le x^*(\partial^+(U_i)) < 1 + \tau$. Rearranging shows $A \ge 1 - 3\tau$. □

Now, recall that $\kappa(x^*)$ denotes the assignment of arc weights to the graph $H = (V, E)$ from the previous
section obtained by "removing" the directions from arcs in $A$. We prove that the restriction of $\kappa(x^*)$ to any
$L_i$ almost satisfies the partition inequalities that characterize the convex hull of connected graphs. For a
partition $\pi = \{W_1, \ldots, W_\ell\}$, we let $\partial(\pi)$ denote the set of edges whose endpoints lie in two different sets in
the partition.

Lemma 2.5. For any $1 \le i \le k + 1$ and any partition $\pi = \{W_1, \ldots, W_\ell\}$ of $L_i$, we have $\kappa(x^*)(\partial(\pi)) \ge
\ell - 1 - 2\tau$.

Proof. Since $L_1 = \{s\}$ and $L_{k+1} = \{t\}$, there is nothing to prove for $i = 1$ or $i = k + 1$. So, we suppose
$1 < i < k + 1$.

Consider the quantity $X = \sum_{j=1}^{\ell} \left( x^*(\partial^+(W_j)) + x^*(\partial^-(W_j)) \right)$. On one hand, since neither $s$ nor $t$ are in any
$W_j$, we have $x^*(\partial^+(W_j)) = x^*(\partial^-(W_j)) \ge 1$ so $X \ge 2\ell$. On the other hand, $X$ counts each arc between two
parts of $\pi$ exactly twice and each arc with one end in $L_i$ and the other not in $L_i$ precisely once. So,

\[ X = 2\kappa(x^*)(\partial(\pi)) + x^*(\partial^+(L_i)) + x^*(\partial^-(L_i)). \]

Notice that $\partial^+(L_i)$ and $\partial^-(L_i)$ are disjoint subsets of $\partial^+(U_{i-1}) \cup \partial^-(U_{i-1}) \cup \partial^+(U_i) \cup \partial^-(U_i)$. So, since
both $U_{i-1}$ and $U_i$ are $\tau$-narrow, $x^*(\partial^+(L_i)) + x^*(\partial^-(L_i)) \le 2 + 4\tau$. This shows $2\ell \le X \le 2\kappa(x^*)(\partial(\pi)) +
2 + 4\tau$ which, after rearranging, is what we wanted to show. □

The following corollary will be useful. 

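Corollary 2.6. For any partition $\pi$ of $L_i$, we have $\frac{\kappa(x^*)(\partial(\pi))}{1 - 2\tau} \ge |\pi| - 1$.

Proof. From Lemma 2.5, we have $\frac{\kappa(x^*)(\partial(\pi))}{1 - 2\tau} \ge \frac{|\pi| - 1 - 2\tau}{1 - 2\tau} \ge |\pi| - 1$ for any $|\pi| \ge 2$. □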

Finally, to efficiently implement the arguments in the proof of Lemma 2.2, we need to be able to efficiently
find all $\tau$-narrow cuts $U_i$. This is done by a standard recursive algorithm that exploits the fact that the cuts
are nested.

Lemma 2.7. There is a polynomial-time algorithm to find all $\tau$-narrow $s$-$t$ cuts.

Proof. Consider the following standard recursive algorithm. As input, the routine is given a directed graph
$H = (V', A')$ with arc weights $x^*$ and distinct nodes $s', t'$ where both $\{s'\}$ and $V' \setminus \{t'\}$ are $\tau$-narrow. Say a
$\tau$-narrow cut $U$ in $H$ is non-trivial if $U \ne \{s'\}$ and $U \ne V' \setminus \{t'\}$. The claim is that the procedure will find
all non-trivial $\tau$-narrow $s'$-$t'$ cuts of $H$, provided that they are nested.

The procedure works as follows. If there are non-trivial $\tau$-narrow $s'$-$t'$ cuts in $H$, then there are nodes
$u, v \in V' \setminus \{s', t'\}$ such that some $\tau$-narrow $s'$-$t'$ cut $U$ has $\{s', u\} \subseteq U \subseteq V' \setminus \{t', v\}$. So, the procedure tries
all $O(|V'|^2)$ pairs of distinct nodes $u, v$, contracts each of $\{s', u\}$ and $\{t', v\}$ to a single node and determines if the
minimum cut separating these contracted nodes has capacity less than $1 + \tau$. If such a cut $U$ was found for
some $u, v$, the algorithm makes two recursive calls, one with the contracted graph $H[V'/U]$ with start node
being the contraction of $U$ and end node being $t'$, and the other with the contracted graph $H[V'/(V' \setminus U)]$ with
start node $s'$ and end node being the contraction of $V' \setminus U$. After both recursive calls complete, the algorithm
returns all $\tau$-narrow cuts found by these two recursive calls (of course, after expanding the contracted nodes)
and the $\tau$-narrow cut $U$ itself. If such a cut $U$ was not found over all choices of $u, v$, then the algorithm returns
nothing since there are no non-trivial $\tau$-narrow $s'$-$t'$ cuts in $H$.

It is easy to see that a non-trivial $\tau$-narrow cut in either contracted graph corresponds to a $\tau$-narrow cut in $H$.
On the other hand, if the $\tau$-narrow $s'$-$t'$ cuts are nested in $H$, then every non-trivial $\tau$-narrow $s'$-$t'$ cut apart
from $U$ itself corresponds to a non-trivial $\tau$-narrow cut in exactly one of $H[V'/U]$ or $H[V'/(V' \setminus U)]$. Also,
the $\tau$-narrow cuts in both contracted graphs remain nested. So, the recursive procedure finds all non-trivial
$\tau$-narrow cuts of $H$. The number of recursive calls is at most the number of non-trivial $\tau$-narrow cuts, and
this is at most $|V'|$ because the cuts are nested, so the algorithm is efficient. We call this algorithm initially
with graph $G$, start node $s$ and end node $t$. Lemma 2.3 implies the $\tau$-narrow $s$-$t$ cuts of $G$ are nested so
the recursive procedure finds all non-trivial $\tau$-narrow cuts of $G$. Adding these to $\{s\}$ and $V \setminus \{t\}$ gives all
$\tau$-narrow cuts of $G$. □

We are now in a position to prove Lemma 2.2, the main result of this section. 

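Proof of Lemma 2.2. The claimed vector $z$ can be described by linear constraints: indeed, consider the
following LP on the variables $z$ where constraints (5) imply that $\kappa(z)$ is in the convex hull of spanning
connected graphs [16, Corollary 50.8a].^1

\[
\begin{array}{lll}
\kappa(z)(\partial(\pi)) \ge |\pi| - 1 & \forall \text{ partitions } \pi \text{ of } V & (5) \\
z_a \le \frac{1}{1 - 3\tau} x^*_a & \forall a \in A & (6) \\
z(\partial^+(U_i)) = 1 & \forall\ \tau\text{-narrow } s\text{-}t \text{ cuts } U_i & (7) \\
z(\partial^-(U_i)) = 0 & \forall\ \tau\text{-narrow } s\text{-}t \text{ cuts } U_i & (8) \\
z_a \ge 0 & \forall a \in A & (9)
\end{array}
\]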


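We demonstrate a feasible $z$ as follows.

\[
z_a = \begin{cases}
\dfrac{x^*_a}{x^*(\partial(L_i; L_{i+1}))} & \text{if } a \in \partial(L_i; L_{i+1}) \text{ for some } i; \\[2ex]
\dfrac{x^*_a}{1 - 2\tau} & \text{if } a \in E[L_i] \text{ for some } i; \\[2ex]
0 & \text{otherwise.}
\end{cases}
\qquad (10)
\]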
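Equation (10) can be read as a three-way rule: arcs from one layer to the next are divided by the total $x^*$-mass between those two layers, arcs inside a layer are divided by $1 - 2\tau$, and every other arc is set to zero. The sketch below applies that reading to a toy input. The function name, the dictionary representation, and the example data are hypothetical, and the code assumes (as Lemma 2.4 guarantees) that the mass between consecutive layers is positive.

```python
def build_z(x, layers, tau):
    """Apply the three cases of equation (10) to an arc-indexed dictionary x."""
    index = {v: i for i, layer in enumerate(layers) for v in layer}
    # forward[i] = x-mass on arcs going from layer i to layer i + 1
    forward = [0.0] * (len(layers) - 1)
    for (a, b), val in x.items():
        if index[b] == index[a] + 1:
            forward[index[a]] += val
    z = {}
    for (a, b), val in x.items():
        if index[b] == index[a] + 1:      # arc between consecutive layers
            z[(a, b)] = val / forward[index[a]]
        elif index[a] == index[b]:        # arc inside a single layer
            z[(a, b)] = val / (1 - 2 * tau)
        else:                             # arc skipping layers or going backwards
            z[(a, b)] = 0.0
    return z

# Hypothetical LP solution and its layers L_1 = {s}, L_2 = {a, b}, L_3 = {t}.
x = {("s", "a"): 0.5, ("s", "b"): 0.5, ("a", "b"): 0.5,
     ("b", "a"): 0.5, ("a", "t"): 0.5, ("b", "t"): 0.5}
z = build_z(x, [{"s"}, {"a", "b"}, {"t"}], tau=0.25)
```

In this example the arcs inside the middle layer are scaled up from 1/2 to 1, while each narrow cut still has exactly one unit of z leaving it, matching constraint (7).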



We claim that this solution $z$ satisfies the above constraints. Constraints (8) and (9) are satisfied by
construction. Constraint (6) follows from Lemma 2.4 for edges in $\partial(L_i; L_{i+1})$ and by construction for the rest of the
edges. For constraint (7), note that

\[ z(\partial^+(U_i)) = z(\partial(L_i; L_{i+1})) + z(\partial^+(U_i) \setminus \partial(L_i; L_{i+1})) = \frac{x^*(\partial(L_i; L_{i+1}))}{x^*(\partial(L_i; L_{i+1}))} + 0 = 1. \]

To complete the proof, we now show constraints (5) hold. It suffices to show that $\kappa(z)$ can be decomposed
as a convex combination of characteristic vectors of connected graphs. For $1 \le i \le k + 1$, let $z^i$ denote the
restriction of $\kappa(z)$ to edges whose endpoints are both contained in $L_i$. Then Corollary 2.6, constraints (9),
and [16, Corollary 50.8a] imply that $z^i$ can be decomposed as a convex combination of integral vectors,
each of which corresponds to an edge set that is connected on $L_i$. Next, let $z'$ denote the restriction of $\kappa(z)$
to edges whose endpoints are both contained in some common $L_i$. Since the sets $E(L_1), \ldots, E(L_{k+1})$ are
disjoint, we have that $z' = \sum_{i=1}^{k+1} z^i$ (where the addition is component-wise). Furthermore, $z'$, being the sum of
the $z^i$ vectors, can be decomposed as a convex combination of integral vectors corresponding to edge sets $E'$
such that the connected components of the graph $H' = (V, E')$ are precisely the sets $\{L_i\}_{i=1}^{k+1}$.

^1 The statement of Lemma 2.2 makes a claim about $\kappa(z)$ being in the convex hull of spanning trees and not spanning connected
graphs. However, the equivalent statement for spanning trees will follow by dropping some edges from the connected subgraphs in
the decomposition of $z$ to get spanning trees. Constraints (7) and (8) will still be satisfied since we retain connectivity.

Next, let $z''$ denote the restriction of $\kappa(z)$ to edges contained in one such $\partial(L_i; L_{i+1})$. We also note that the
sets $\partial(L_1; L_2), \ldots, \partial(L_k; L_{k+1})$ are disjoint. By construction, we have $z''(\partial(L_i; L_{i+1})) = 1$ for each $1 \le i \le k$
so we may decompose $z''$ as a convex combination of integral vectors, each of which includes precisely one
edge across each $\partial(L_i; L_{i+1})$.

Now, adding any integral point $y'$ in the decomposition of $z'$ to any integral point $y''$ in the decomposition
of $z''$ results in an integral vector that corresponds to a connected graph: each $L_i$ is connected by $y'$ and
consecutive $L_i$ are connected by $y''$. By construction of $z$, we have $\kappa(z) = z' + z''$ so we may write $z$ as a
convex combination of characteristic vectors of connected graphs, each of which satisfies constraints (5).

To see why $z$ can be found efficiently, we first compute all $\tau$-narrow cuts using Lemma 2.7. Then $z$ is easy
to compute in equation (10). Finally, [16, Corollary 51.6a] implies the decomposition of $\kappa(z)$ into a convex
combination of connected graphs can be done efficiently, so the arguments in the footnote to reduce $z$ such
that $\kappa(z)$ is in the spanning tree polytope can also be done efficiently. □

3 Obtaining an s-t Path 

Having transformed the optimal LP solution $x^*$ into the new vector $z$ (as in Lemma 2.2) without increasing
it too much in any coordinate, we now sample a random tree such that it has a small total cost, and that the
tree does not cross any cut much more than prescribed by $x^*$. Finally we add some arcs to this tree (without
increasing its cost much) so that it is Eulerian at all nodes except $\{s, t\}$, and hence gives us an Eulerian $s$-$t$
walk. By the triangle inequality, shortcutting this walk past repeated nodes yields a Hamiltonian $s$-$t$ path
of no greater cost. While this general approach is similar to that used in [2], some new ideas are required
because we are working with the LP for ATSPP: in particular, only one unit of flow is guaranteed to cross
$s$-$t$ cuts, which is why we needed to deal with narrow cuts in the first place. The details appear in the rest of
this section.

3.1 Sampling a Tree 

For a collection of arcs $\mathcal{A} \subseteq A$, we say $\mathcal{A}$ is $\alpha$-thin with respect to $x^*$ if $|\mathcal{A} \cap \partial^+(U)| \le \alpha x^*(\partial^+(U))$ for
every $\emptyset \subsetneq U \subsetneq V$. The set $\mathcal{A}$ is also $\beta$-approximate with respect to $x^*$ if the total cost of all arcs in $\mathcal{A}$ is at
most $\beta$ times the cost of $x^*$, i.e., $\sum_{a \in \mathcal{A}} c_a \le \beta \sum_{a \in A} c_a x^*_a$. The reason we are deviating from the undirected
to the directed setting is that the orientation of the arcs across each $\tau$-narrow cut will be important when we
sample a random "tree".

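Lemma 3.1. Let $\tau \in [0, 1/4]$. Let $\beta = \frac{3}{1 - 3\tau}$ and $\alpha = 4(2 + \tau^{-1}) \cdot \frac{c \log n}{\log\log n}$ for a sufficiently large constant $c$.
For sufficiently large $n$, there is a randomized, polynomial time algorithm that, with probability at least 1/2,
finds an $\alpha$-thin and $\beta$-approximate (with respect to $x^*$) collection of arcs $\mathcal{A}$ that is weakly connected and
satisfies $|\mathcal{A} \cap \partial^+(U)| = 1$ and $|\mathcal{A} \cap \partial^-(U)| = 0$ for each $\tau$-narrow $s$-$t$ cut $U$.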

Proof. Let $z$ be a vector as promised by Lemma 2.2. From $\kappa(z)$, randomly sample a set of arcs $\mathcal{A}$ whose
undirected version $\mathcal{T}$ is a spanning tree on $V$. This should be done from any distribution with the following
two properties:

(i) (Correct Marginals) $\Pr[e \in \mathcal{T}] = \kappa(z)_e$

(ii) (Negative Correlation) For any subset of edges $F \subseteq E$, $\Pr[F \subseteq \mathcal{T}] \le \prod_{e \in F} \Pr[e \in \mathcal{T}]$

This can be obtained using, for example, the swap rounding approach for the spanning tree polytope given
by Chekuri et al. [7]. As in [2], the negative correlation property implies the following theorem. The proof
is found in Section 4.

Theorem 3.2. The tree $\mathcal{T}$ is $\alpha$-thin with high probability.

By Lemma 2.2(b), property (i) of the random sampling, and Markov's inequality, we get that $\mathcal{A}$ (from
Lemma 3.1) is $\beta$-approximate with respect to $x^*$ with probability at least 2/3. By a trivial union bound,
for large enough $n$ we have with probability at least 1/2 that $\mathcal{A}$ is both $\alpha$-thin and $\beta$-approximate with
respect to $x^*$. It is also weakly connected, i.e., the undirected version of $\mathcal{A}$ (namely, $\mathcal{T}$) connects all
vertices in $V$.

The statement for $\tau$-narrow $s$-$t$ cuts follows from the fact that $z$ satisfies Lemma 2.2(c). That is, $\mathcal{A}$ contains
no arcs of $\partial^-(U)$, since $z(\partial^-(U)) = 0$ (for $U$ being a $\tau$-narrow $s$-$t$ cut). But since $\mathcal{T}$ is a spanning tree,
$\mathcal{A}$ must contain at least one arc from $\partial^+(U)$. Finally, since $z(\partial^+(U))$ is exactly 1, any set of arcs
supported by the distribution we use must have precisely one arc from $\partial^+(U)$. □

3.2 Augmenting to an Eulerian s-t Walk 

Finally, we wrap up by augmenting the set of arcs $\mathcal{A}$ to an Eulerian $s$-$t$ walk. For this, we use Hoffman's
circulation theorem, as in [2], which we recall here for convenience (see, e.g., [16, Theorem 11.2]):

Theorem 3.3. Given a directed flow network $D = (V, A)$, with each arc having a lower bound $\ell_a$ and an
upper bound $u_a$ (and $0 \le \ell_a \le u_a$), there exists a circulation $f : A \to \mathbb{R}_+$ satisfying $\ell_a \le f(a) \le u_a$ for all
arcs $a$ if and only if $\ell(\partial^+(U)) \le u(\partial^-(U))$ for all $U \subseteq V$. Moreover, if the $\ell$ and $u$ are integral, then the
circulation $f$ can be taken integral.
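The cut condition of Theorem 3.3 (total lower bound leaving a set is at most total upper bound entering it) can be checked directly on small networks. The following is only an illustrative sketch: the function name and the two toy networks are hypothetical, arcs absent from a dictionary are treated as absent from the network, and the enumeration over all vertex subsets is exponential.

```python
from itertools import combinations

def hoffman_condition_holds(V, lower, upper, eps=1e-9):
    """Check that lower(out(U)) <= upper(in(U)) for every nonempty proper subset U of V."""
    V = list(V)
    for k in range(1, len(V)):
        for U in combinations(V, k):
            U = set(U)
            l_out = sum(l for (a, b), l in lower.items() if a in U and b not in U)
            u_in = sum(u for (a, b), u in upper.items() if a not in U and b in U)
            if l_out > u_in + eps:
                return False
    return True

V = ["a", "b", "c"]
# Directed triangle a -> b -> c -> a: one unit is forced on arc ab.
lower = {("a", "b"): 1}
upper = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
# Same network without arc ca: the forced unit on ab can no longer return to a.
upper_broken = {("a", "b"): 1, ("b", "c"): 1}
```

In the triangle, routing one unit around the cycle is a feasible circulation, and the cut condition holds; after removing arc ca the set {a} violates it, so no circulation exists.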

Set lower bounds $\ell : A \to \{0, 1\}$ on the arcs by:

\[
\ell_a = \begin{cases}
1 & \text{if } a \in \mathcal{A} \text{ or } a = ts \\
0 & \text{otherwise}
\end{cases}
\]

For now, we set an upper bound of 1 on arc $ts$ and leave all other arc upper bounds at $\infty$. We compute
the minimum cost circulation satisfying these bounds (we will soon see why one must exist). Since the
bounds are integral and since $\mathcal{A}$ is weakly connected, this circulation gives us a directed Eulerian graph.
Furthermore, since $u_{ts} = \ell_{ts} = 1$, the $ts$ arc must appear exactly once in this Eulerian graph. Our final
Hamiltonian $s$-$t$ path is obtained by following an Eulerian circuit, removing the single $ts$ arc from this
circuit to get an Eulerian $s$-$t$ walk, and finally shortcutting this walk past repeated nodes. The cost of this
Hamiltonian path will be, by the triangle inequality, at most the cost of the circulation minus the cost of the
$ts$ arc.

Finally, we need to bound the cost of the circulation (and also to prove one exists). To this end, we will
impose further upper bounds $u : A \to \mathbb{R}_{\ge 0}$ as follows:

\[
u_a = \begin{cases}
1 & \text{if } a = ts \\
1 + (1 + \tau^{-1}) \alpha x^*_a & \text{if } a \in \mathcal{A} \\
(1 + \tau^{-1}) \alpha x^*_a & \text{otherwise}
\end{cases}
\]

We use Hoffman's circulation theorem to show that a circulation $f$ exists satisfying these bounds $\ell$ and
$u$. (The calculations appear in the next paragraph.) Since $u$ is no longer integral, the circulation $f$ might
not be integral, but it does demonstrate that a circulation exists where each arc $a \ne ts$ is assigned at most
$(1 + \tau^{-1}) \alpha x^*_a$ more flow in the circulation than the number of times it appears in $\mathcal{A}$. Consequently, it
shows that the minimum cost circulation $g$ in the setting where we only had a non-trivial upper bound of
1 on the arc $ts$ can be no more expensive (since there are fewer constraints), and that circulation $g$ can be
chosen to be integral. The cost of circulation $g$ is at most the cost of $f$, which is at most

\[ \sum_{a \in A} c_a u_a = c_{ts} + \sum_{a \in \mathcal{A}} c_a + (1 + \tau^{-1}) \alpha \sum_{a \in A} c_a x^*_a. \]

Subtracting the cost of the $ts$ arc (since we drop it to get the Hamilton path) and recalling that $\mathcal{A}$ is
$\beta$-approximate with respect to $x^*$ (and hence $\sum_{a \in \mathcal{A}} c_a \le \beta \sum_{a \in A} c_a x^*_a$), we get that the final Hamiltonian path
has cost at most

\[ \left( \beta + (1 + \tau^{-1}) \alpha \right) \sum_{a \in A} c_a x^*_a, \]

and hence $O(\log n/\log\log n)$ times the cost of the LP relaxation for $\tau = 1/4$. This proves the claim that the cost of
the $s$-$t$ path we found is $O(\log n/\log\log n)$ times the LP value, with constant probability, and completes the proof of
Theorem 1.1.
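The final post-processing step (follow an Eulerian circuit, drop the single ts arc, then shortcut repeated nodes) can be sketched as follows. This is a minimal illustration, assuming the circuit is already given as a closed node sequence that uses the arc ts exactly once; the function names, the example circuit, and the cost matrix are hypothetical.

```python
def circuit_to_walk(circuit, s, t):
    """Rotate a closed node sequence so it starts at s, dropping its single t -> s arc."""
    assert circuit[0] == circuit[-1], "expected a closed circuit"
    i = next(j for j in range(len(circuit) - 1)
             if circuit[j] == t and circuit[j + 1] == s)
    # Continue from s to the end of the list, then wrap around up to t.
    return circuit[i + 1:] + circuit[1:i + 1]

def shortcut(walk):
    """Keep the first visit to every node; the endpoint t is only emitted at the very end."""
    s, t = walk[0], walk[-1]
    seen = {s, t}
    path = [s]
    for v in walk[1:-1]:
        if v not in seen:
            seen.add(v)
            path.append(v)
    path.append(t)
    return path

def length(cost, nodes):
    """Total cost of consecutive arcs along a node sequence."""
    return sum(cost[a][b] for a, b in zip(nodes, nodes[1:]))

# Hypothetical directed metric on vertices 0..3 with s = 0 and t = 3.
cost = [
    [0, 1, 2, 2],
    [2, 0, 1, 2],
    [2, 2, 0, 1],
    [1, 2, 2, 0],
]
circuit = [0, 1, 2, 1, 3, 0]           # closed walk using the arc 3 -> 0 once
walk = circuit_to_walk(circuit, 0, 3)  # Eulerian s-t walk
path = shortcut(walk)                  # Hamiltonian s-t path
```

Because the costs satisfy the triangle inequality, the shortcut path is never longer than the walk it came from, which is the only property of the metric this step uses.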

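One detail remains: we need to verify the conditions of Theorem 3.3 for the bounds $\ell$ and $u$. Firstly, it is
clear by definition that $\ell_a \le u_a$ for each arc $a$. Now we need to check $\ell(\partial^+(U)) \le u(\partial^-(U))$ for each cut
$U$. This is broken into four cases (where saying $U$ is a $u$-$v$ cut means $u \in U$, $v \notin U$).

1. $U$ is a $\tau$-narrow $s$-$t$ cut. Then $\ell(\partial^+(U)) = 1$, since $\mathcal{A}$ contains only one arc in $\partial^+(U)$. But $1 = u_{ts} \le u(\partial^-(U))$.

2. $U$ is an $s$-$t$ cut, but not $\tau$-narrow. Then by the $\alpha$-thinness of $\mathcal{A}$,

\[ \ell(\partial^+(U)) \le \alpha x^*(\partial^+(U)) = \alpha x^*(\partial^-(U)) + \alpha. \]

On the other hand,

\[ u(\partial^-(U)) \ge (1 + \tau^{-1}) \alpha x^*(\partial^-(U)) = \alpha x^*(\partial^-(U)) + \tau^{-1} \alpha x^*(\partial^-(U)) \ge \alpha x^*(\partial^-(U)) + \alpha \]

where the last inequality used the fact that $x^*(\partial^-(U)) \ge \tau$.

3. $U$ is a $t$-$s$ cut. Then

\[ \ell(\partial^+(U)) \le 1 + \alpha x^*(\partial^+(U)) = 1 + \alpha x^*(\partial^-(U)) - \alpha \le \alpha x^*(\partial^-(U)), \]

the last inequality using that $\alpha \ge 1$. Moreover

\[ u(\partial^-(U)) \ge (1 + \tau^{-1}) \alpha x^*(\partial^-(U)) \ge \alpha x^*(\partial^-(U)). \]

Then $\ell(\partial^+(U)) \le u(\partial^-(U))$.

4. $U$ does not separate $s$ from $t$. Then

\[ \ell(\partial^+(U)) \le \alpha x^*(\partial^+(U)) = \alpha x^*(\partial^-(U)) \le (1 + \tau^{-1}) \alpha x^*(\partial^-(U)) \le u(\partial^-(U)). \]

4 Guaranteeing $\alpha$-Thinness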



We prove Theorem 3.2 in this section. Recall that $\alpha$-thin means the number of arcs chosen from $\partial^+(U)$
should not exceed $\alpha x^*(\partial^+(U))$ (so a directed version). Let $\alpha := 4(2 + \tau^{-1}) \cdot \frac{c \log n}{\log\log n}$ for a sufficiently large
constant $c$, where the logarithm is the natural logarithm. Recall that $\mathcal{A}$ is the set of arcs found with corresponding
undirected spanning tree $\mathcal{T}$. By the first property of the distribution (preservation of marginals on singletons) we have
for each $\emptyset \subsetneq U \subsetneq V$ that $E[|\partial_{\mathcal{T}}(U)|] = \kappa(z)(\partial(U))$.^2

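Since we have negative correlation on subsets of items, we can apply a Chernoff bound. For notational
simplicity, let $z' := \kappa(z)$. Then we use

\[ \Pr\left[ |\partial_{\mathcal{T}}(U)| \ge (1 + \delta) z'(\partial(U)) \right] \le \left( \frac{e^{\delta}}{(1 + \delta)^{1 + \delta}} \right)^{z'(\partial(U))}. \]

Let $\beta := \frac{c \log n}{\log\log n}$ (again using the natural logarithm), where $c$ is the constant in the definition of $\alpha$, so that
$\alpha = 4(2 + \tau^{-1}) \beta$, and let $\delta := \beta - 1$. For large enough $n$, the above
expression is bounded (in a manner similar to [2]) by

\[ \left( \frac{e}{\beta} \right)^{\beta z'(\partial(U))} \le e^{-5 z'(\partial(U)) \log n} = n^{-5 z'(\partial(U))}. \]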
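The step from the Chernoff expression to the simpler form with base $e/\beta$ only uses $1 + \delta = \beta$ and the fact that dropping a factor $e^{-\mu}$ can only increase the bound. A small numeric sketch makes this visible; it works with natural logarithms of the two bounds to avoid underflow. The function names and the sample values of $\beta$ and the mean $\mu$ are hypothetical.

```python
import math

def log_chernoff(mu, delta):
    """Natural log of (e^delta / (1 + delta)^(1 + delta))^mu."""
    return mu * (delta - (1 + delta) * math.log(1 + delta))

def log_simplified(mu, beta):
    """Natural log of (e / beta)^(beta * mu), the simplified bound with beta = 1 + delta."""
    return beta * mu * (1 - math.log(beta))

# Compare the two bounds for a few sample values; mu plays the role of z'(cut).
samples = [(beta, mu) for beta in (2.0, 5.0, 20.0) for mu in (1.0, 1.5, 3.0)]
gaps = [log_simplified(mu, beta) - log_chernoff(mu, beta - 1) for beta, mu in samples]
```

Algebraically the gap between the two logarithms is exactly $\mu$, so the simplified expression is always a valid (slightly weaker) upper bound.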



However, for any graph, there are at most n^' cuts whose capacity is at most / times the capacity of the 
minimum cut [12]. Since the minimum cut with capacities z' is 1, then there are at most rp-' cuts of the 
undirected graph H with capacity (under z') at most /. Another way to view this is that there are at most 
^2(/+i) ^.yjg whose capacity is between / and / + 1. For each such cut U, the previous analysis shows that 
probability that \d,^{U)\ > {\ + 5)z' {d (U)) is at most Thus, by the union bound, the probability that 
\d3r{U)\ > {l + 5)z'{d{U)) for some C [/ C V is bounded by 

OO OO 1 

yn^('+^^-n-"<yn-' = — 

Since < \d,^{U)\, then we have just seen that with probability at least 1 — that there is no 

%CUC.V with |5^(?7)| > pz'{d{U)). Now, there are three additional cases to consider: 

• $x^*(\delta^+(U)) \le \tau$ (so $t \in U$ and $s \notin U$). We actually ignore the above analysis in this case and simply use the fact that $z(\delta^+(U)) = 0$ is guaranteed by construction of $z$. Then we trivially have $|\delta^+_{\mathcal{A}}(U)| = 0 \le \alpha x^*(\delta^+(U))$.

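• $t \in U$ and $s \notin U$, but $x^*(\delta^+(U)) > \tau$. Since $\tau \le 1/4$, the construction of $z$ gives $z \le 4x^*$. Using $x^*(\delta^-(U)) = x^*(\delta^+(U)) + 1$ for such cuts, we have

$$\begin{aligned}
|\delta^+_{\mathcal{A}}(U)| &\le \beta z'(\delta(U)) \\
&\le 4\beta\big(x^*(\delta^+(U)) + x^*(\delta^-(U))\big) \\
&= 4\beta\big(2x^*(\delta^+(U)) + 1\big) \\
&= 8\beta x^*(\delta^+(U)) + 4\beta \\
&\le 8\beta x^*(\delta^+(U)) + \frac{4\beta}{\tau}\, x^*(\delta^+(U)) \\
&= \alpha x^*(\delta^+(U)).
\end{aligned}$$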



^1 Here we use $\delta_{\mathcal{T}}(U)$ to denote the set $\delta(U) \cap \mathcal{T}$; similarly we will let $\delta^+_{\mathcal{A}}(U)$ denote $\delta^+(U) \cap \mathcal{A}$, etc.
Figure 1: The graph $G_r$ with $r = 5$. The solid edges have cost 1 and the dashed edges have cost 0.



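• Finally, if it is not the case that $\{t\} \subseteq U \subseteq V \setminus \{s\}$, then $x^*(\delta^+(U)) + x^*(\delta^-(U)) \le 2x^*(\delta^+(U))$, so we get (using $z \le 4x^*$ again)

$$\begin{aligned}
|\delta^+_{\mathcal{A}}(U)| &\le \beta z'(\delta(U)) \\
&\le 4\beta\big(x^*(\delta^+(U)) + x^*(\delta^-(U))\big) \\
&\le 8\beta x^*(\delta^+(U)) \\
&\le \alpha x^*(\delta^+(U)).
\end{aligned}$$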

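Summarizing, for sufficiently large $n$ we have, with probability at least $1 - \frac{1}{n-1}$, that

$$|\delta^+_{\mathcal{A}}(U)| \le \alpha x^*(\delta^+(U)) \quad \text{for every } \emptyset \subsetneq U \subsetneq V, \qquad \text{where } \alpha = O\!\left(\frac{\log n}{\log\log n}\right).$$

That is, $\mathcal{A}$ is α-thin with high probability.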

5 A Simple Integrality Gap Example 

In this section, we show that the integrality gap of the subtour elimination LP (ATSPP) is at least 2. This 
result can also be inferred from the integrality gap of 2 for the ATSP tour problem [5], but our construction 
is relatively simpler. 

For a fixed integer $r \ge 1$, consider the directed graph $G_r$ defined below (and illustrated in Figure 1). The vertices of $G_r$ are $\{s,t\} \cup \{u_1,\ldots,u_r\} \cup \{v_1,\ldots,v_r\}$; the edges are as follows:

• $\{su_1, sv_1, u_rt, v_rt\}$, each with cost 1,

• $\{u_1v_r, v_1u_r\}$, each with cost 0,

• $\{u_{i+1}u_i \mid 1 \le i < r\} \cup \{v_{i+1}v_i \mid 1 \le i < r\}$, each with cost 1,

• and $\{u_iu_{i+1} \mid 1 \le i < r\} \cup \{v_iv_{i+1} \mid 1 \le i < r\}$, each with cost 0.

Let $\Gamma_r$ denote the ATSPP instance obtained from the metric completion of $G_r$.

Lemma 5.1. The integrality gap of the LP (ATSPP) on the instance $\Gamma_r$ is at least $2 - o(1)$.

Proof. It is easy to verify that assigning $x_a = 1/2$ to each arc that originally appeared in $G_r$ is a valid LP solution. Indeed, the degree constraints are immediate, and there are two edge-disjoint paths from $s$ to every other node in $G_r$ (so there must be at least 2 arcs exiting any proper subset containing $s$), so the cut constraints are also satisfied. The total cost of this LP solution is $r + 1$.

On the other hand, we claim that the cost of any Hamiltonian $s$-$t$ path in $\Gamma_r$, which corresponds to a spanning $s$-$t$ walk $W$ in $G_r$, is at least $2r - 1$. This shows an integrality gap of $\frac{2r-1}{r+1} = 2 - o(1)$.

To lower-bound the length of any spanning $s$-$t$ walk, we first argue that the walk $W$ can avoid using at most one of the unit cost edges of the form $u_{i+1}u_i$ or $v_{i+1}v_i$. Indeed, any $u_r$-$v_r$ walk must use the edges $u_{i+1}u_i$ for every $1 \le i < r$. Similarly, every $v_r$-$u_r$ walk must use all edges of the form $v_{i+1}v_i$. One of $u_r$ and $v_r$ is visited before the other, so either all of the $u_{i+1}u_i$ edges or all of the $v_{i+1}v_i$ edges are used by $W$. Now suppose, without loss of generality, that $W$ does not use the edges $u_{i+1}u_i$ and $u_{j+1}u_j$ for some $1 \le i < j < r$. Every $u_{i+1}$-$v_r$ walk uses edge $u_{i+1}u_i$ and every $v_r$-$u_{i+1}$ walk uses edge $u_{j+1}u_j$. Since one of $u_{i+1}$ or $v_r$ must be visited by $W$ before the other, $W$ cannot avoid both $u_{i+1}u_i$ and $u_{j+1}u_j$, which contradicts our assumption.

Thus, $W$ must use all but at most one of the $2r - 2$ unit cost edges in $\{u_{i+1}u_i \mid 1 \le i < r\} \cup \{v_{i+1}v_i \mid 1 \le i < r\}$. Moreover, $W$ must also use one of the arcs exiting $s$ and one of the arcs entering $t$, so the cost of $W$ is at least $2r - 1$. (In fact, the walk

$$(s, u_1, v_r, v_{r-1}, \ldots, v_1, u_r, u_{r-1}, \ldots, u_3, u_2, u_3, \ldots, u_r, t)$$

is of length exactly $2r - 1$, so this argument is tight.) □
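For small $r$, both halves of the lemma can be checked by brute force. The sketch below is an illustration under stated assumptions, not part of the proof. It assumes the edge set of $G_r$ is $su_1, sv_1, u_rt, v_rt$ (cost 1), $u_1v_r, v_1u_r$ (cost 0), backward arcs of cost 1 and forward arcs of cost 0. It also assumes the LP asks for one unit leaving $s$, one unit entering $t$, one unit in and out of every other vertex, and $x(\delta^+(U)) \ge 1$ for every $U$ containing $s$ but not $t$. The names `build_graph`, `lp_cost` and `shortest_spanning_walk` are hypothetical helpers.

```python
import heapq
import itertools
from fractions import Fraction


def build_graph(r):
    """Arc costs of G_r; vertices are 's', 't', ('u', i) and ('v', i) for 1 <= i <= r."""
    u = lambda i: ("u", i)
    v = lambda i: ("v", i)
    cost = {
        ("s", u(1)): 1, ("s", v(1)): 1,    # arcs leaving s
        (u(r), "t"): 1, (v(r), "t"): 1,    # arcs entering t
        (u(1), v(r)): 0, (v(1), u(r)): 0,  # arcs crossing between the two sides
    }
    for i in range(1, r):
        for w in (u, v):
            cost[(w(i + 1), w(i))] = 1  # backward arcs, unit cost
            cost[(w(i), w(i + 1))] = 0  # forward arcs, zero cost
    return cost


def lp_cost(r):
    """Check that x_a = 1/2 satisfies the assumed constraints; return its cost."""
    cost = build_graph(r)
    x = {arc: Fraction(1, 2) for arc in cost}
    nodes = {p for arc in cost for p in arc}
    out_x = lambda S: sum(val for (a, b), val in x.items() if a in S and b not in S)
    in_x = lambda S: sum(val for (a, b), val in x.items() if a not in S and b in S)
    assert out_x({"s"}) == 1 and in_x({"s"}) == 0
    assert in_x({"t"}) == 1 and out_x({"t"}) == 0
    inner = sorted(nodes - {"s", "t"})
    for w in inner:
        assert in_x({w}) == 1 and out_x({w}) == 1
    for k in range(len(inner) + 1):  # every U with s in U and t not in U
        for extra in itertools.combinations(inner, k):
            assert out_x({"s", *extra}) >= 1
    return sum(cost[a] * x[a] for a in cost)


def shortest_spanning_walk(r):
    """Dijkstra over (vertex, visited set) states: cheapest s-t walk visiting all vertices."""
    cost = build_graph(r)
    nodes = sorted({p for arc in cost for p in arc}, key=str)
    idx = {p: k for k, p in enumerate(nodes)}
    out = {p: [] for p in nodes}
    for (a, b), c in cost.items():
        out[a].append((b, c))
    full = (1 << len(nodes)) - 1
    start = ("s", 1 << idx["s"])
    dist = {start: 0}
    heap = [(0, 0, start)]  # (distance, tie-breaker, state)
    counter = 1
    while heap:
        d, _, (p, seen) = heapq.heappop(heap)
        if d > dist[(p, seen)]:
            continue
        if p == "t" and seen == full:
            return d
        for q, c in out[p]:
            state = (q, seen | (1 << idx[q]))
            if d + c < dist.get(state, float("inf")):
                dist[state] = d + c
                heapq.heappush(heap, (d + c, counter, state))
                counter += 1
    return None


if __name__ == "__main__":
    for r in range(2, 5):
        print(r, lp_cost(r), shortest_spanning_walk(r))
```

The state space has size $(2r+2) \cdot 2^{2r+2}$, so this check is only practical for small $r$; it is meant to confirm the pattern $r + 1$ versus $2r - 1$ rather than to replace the argument.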

6 Conclusion 

In this paper we showed that the integrality gap for the ATSPP problem is $O\!\left(\frac{\log n}{\log\log n}\right)$. In fact, our proof also shows an integrality gap of α for ATSPP whenever we can construct a procedure which takes a point $y \in \mathbb{R}^{|E|}$ in the spanning tree polytope of an undirected (multi)graph $H = (V, E)$ and outputs a tree $T$ that is (a) α-thin, and (b) also satisfies $|T \cap \delta(U)| = 1$ for any cut $U$ where $y(\delta(U)) = 1$. We also showed a simpler construction achieving a lower bound of 2 for the subtour elimination LP.